{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "635d8ebb",
   "metadata": {},
   "source": [
    "# Conversation-With-History\n",
    "\n",
    "- Author: [Sunworl Kim](https://github.com/sunworl)\n",
    "- Design:\n",
    "- Peer Review: [Yun Eun](https://github.com/yuneun92)\n",
    "- Proofread : [Yun Eun](https://github.com/yuneun92)\n",
    "- This is a part of [LangChain Open Tutorial](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial)\n",
    "\n",
    "[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/12-RAG/05-Conversation-With-History.ipynb)[![Open in GitHub](https://img.shields.io/badge/Open%20in%20GitHub-181717?style=flat-square&logo=github&logoColor=white)](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/12-RAG/05-Conversation-With-History.ipynb)\n",
    "## Overview\n",
    "\n",
    "This tutorial provides a comprehensive guide to implementing **conversational AI systems** with memory capabilities using LangChain in two main approaches.\n",
    "\n",
    "**1. Creating a chain to record conversations**\n",
    "\n",
    "- Creates a simple question-answering **chatbot** using ```ChatOpenAI```.\n",
    "\n",
    "- Implements a system to store and retrieve conversation history based on session IDs.\n",
    "\n",
    "- Uses ```RunnableWithMessageHistory``` to incorporate chat history into the chain.\n",
    "\n",
    "\n",
    "**2. Creating a RAG chain that retrieves information from documents and records conversations**\n",
    "\n",
    "- Builds a more complex system that combines document retrieval with conversational AI. \n",
    "\n",
    "- Processes a **PDF document** , creates embeddings, and sets up a vector store for efficient retrieval.\n",
    "\n",
    "- Implements a **RAG chain** that can answer questions based on the document content and previous conversation history.\n",
    "\n",
    "\n",
    "### Table of Contents\n",
    "\n",
    "- [Overview](#overview)\n",
    "- [Environment Setup](#environment-setup)\n",
    "- [Creating a Chain that remembers previous conversations](#creating-a-chain-that-remembers-previous-conversations)\n",
    "  - [1. Adding Chat History to the Core Chain](#1-adding-chat-history-to-the-core-chain)\n",
    "  - [2. Implementing RAG with Conversation History Management](#2-implementing-rag-with-conversation-history-management)\n",
    "\n",
    "\n",
    "### References\n",
    "\n",
    "- [Langchain Python API : RunnableWithMessageHistory](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.history.RunnableWithMessageHistory.html)\n",
    "- [Langchain docs : Build a Chatbot](https://python.langchain.com/docs/tutorials/chatbot/) \n",
    "----"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c6c7aba4",
   "metadata": {},
   "source": [
    "## Environment Setup\n",
    "\n",
    "Set up the environment. You may refer to [Environment Setup](https://wikidocs.net/257836) for more details.\n",
    "\n",
    "**[Note]**\n",
    "- ```langchain-opentutorial``` is a package that provides a set of easy-to-use environment setup, useful functions and utilities for tutorials. \n",
    "- You can checkout the [```langchain-opentutorial```](https://github.com/LangChain-OpenTutorial/langchain-opentutorial-pypi) for more details."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "21943adb",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%capture --no-stderr\n",
    "%pip install langchain-opentutorial"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "f25ec196",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Install required packages\n",
    "from langchain_opentutorial import package\n",
    "\n",
    "package.install(\n",
    "    [\n",
    "        \"langsmith\",\n",
    "        \"langchain\",\n",
    "        \"langchain_core\",\n",
    "        \"langchain_community\",\n",
    "        \"langchain_text_splitters\",\n",
    "        \"langchain_openai\",\n",
    "    ],\n",
    "    verbose=False,\n",
    "    upgrade=False,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "7f9065ea",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Environment variables have been set successfully.\n"
     ]
    }
   ],
   "source": [
    "# Set environment variables\n",
    "from langchain_opentutorial import set_env\n",
    "\n",
    "set_env(\n",
    "    {\n",
    "        \"OPENAI_API_KEY\": \"\",\n",
    "        \"LANGCHAIN_API_KEY\": \"\",\n",
    "        \"LANGCHAIN_TRACING_V2\": \"true\",\n",
    "        \"LANGCHAIN_ENDPOINT\": \"https://api.smith.langchain.com\",\n",
    "        \"LANGCHAIN_PROJECT\": \"Conversation-With-History\"  \n",
    "    }\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "690a9ae0",
   "metadata": {},
   "source": [
    "You can alternatively set API keys such as ```OPENAI_API_KEY``` in a ```.env``` file and load them.\n",
    "\n",
    "[Note] This is not necessary if you've already set the required API keys in previous steps."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "4f99b5b6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Load API keys from .env file\n",
    "from dotenv import load_dotenv\n",
    "\n",
    "load_dotenv(override=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aa00c3f4",
   "metadata": {},
   "source": [
    "## Creating a Chain that remembers previous conversations\n",
    "\n",
    "Background knowledge needed to understand this content : [RunnableWithMessageHistory](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.history.RunnableWithMessageHistory.html#runnablewithmessagehistory)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2b2fc536",
   "metadata": {},
   "source": [
    "### 1. Adding Chat History to the Core Chain\n",
    "\n",
    "- Implement ```MessagesPlaceholder``` to incorporate conversation history\n",
    "\n",
    "- Define a prompt template that handles user input queries\n",
    "\n",
    "- Initialize a ```ChatOpenAI``` instance configured to use the **ChatGPT** model\n",
    "\n",
    "- Construct a chain by connecting the prompt template, language model, and output parser\n",
    "\n",
    "- Implement ```StrOutputParser``` to format the model's response as a string"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1b78d33f",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder\n",
    "from langchain_community.chat_message_histories import ChatMessageHistory\n",
    "from langchain_core.runnables.history import RunnableWithMessageHistory\n",
    "from langchain_openai import ChatOpenAI\n",
    "from langchain_core.output_parsers import StrOutputParser\n",
    "\n",
    "# Define the chat prompt template with system message and history placeholder\n",
    "prompt = ChatPromptTemplate.from_messages(\n",
    "    [\n",
    "        (\n",
    "            \"system\",\n",
    "            \"You are a Question-Answering chatbot. Please provide an answer to the given question.\",\n",
    "        ),\n",
    "        # Note: Keep 'chat_history' as the key name for maintaining conversation context\n",
    "        MessagesPlaceholder(variable_name=\"chat_history\"),\n",
    "        # Format user question as input variable {question}\n",
    "        (\"human\", \"#Question:\\n{question}\"),\n",
    "    ]\n",
    ")\n",
    "\n",
    "# Initialize the ChatGPT language model\n",
    "llm = ChatOpenAI()\n",
    "\n",
    "# Build the processing chain: prompt -> LLM -> string output\n",
    "chain = prompt | llm | StrOutputParser()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c9e4d831",
   "metadata": {},
   "source": [
    "Creating a Chain with Conversation History (```chain_with_history```)\n",
    "\n",
    "- Initialize a dictionary to store conversation session records\n",
    "\n",
    "- Create the function ```get_session_history``` that retrieves chat history by session ID and creates a new ```ChatMessageHistory``` instance if none exists\n",
    "\n",
    "- Instantiate a ```RunnableWithMessageHistory``` object to handle persistent conversation history\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0874c14b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Initialize an empty dictionary to store conversation sessions\n",
    "store = {}\n",
    "\n",
    "# Get or create chat history for a given session ID\n",
    "def get_session_history(session_ids):\n",
    "    print(f\"[Conversation Session ID]: {session_ids}\")\n",
    "    \n",
    "    if session_ids not in store:     \n",
    "        # Initialize new chat history for this session\n",
    "        store[session_ids] = ChatMessageHistory()\n",
    "    return store[session_ids]  # Return existing or newly created chat history\n",
    "\n",
    "# Configure chain with conversation history management\n",
    "chain_with_history = RunnableWithMessageHistory(\n",
    "    chain,\n",
    "    get_session_history,  \n",
    "    input_messages_key=\"question\",  # User input variable name\n",
    "    history_messages_key=\"chat_history\",  # Conversation history variable name\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d7c108df",
   "metadata": {},
   "source": [
    "Process the initial input."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a2d22b26",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Conversation Session ID]: abc123\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "'Hello Jack! How can I help you today?'"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chain_with_history.invoke(\n",
    "\n",
    "    # User input message\n",
    "    {\"question\": \"My name is Jack.\"},\n",
    "    \n",
    "    # Configure session ID for conversation tracking\n",
    "    config={\"configurable\": {\"session_id\": \"abc123\"}},\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "25a0901c",
   "metadata": {},
   "source": [
    "Handle Subsequent Query."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ec89414f",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Conversation Session ID]: abc123\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "'Your name is Jack.'"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chain_with_history.invoke(\n",
    "\n",
    "    # User follow-up question\n",
    "    {\"question\": \"What is my name?\"},\n",
    "\n",
    "    # Use same session ID to maintain conversation context\n",
    "    config={\"configurable\": {\"session_id\": \"abc123\"}},\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5fc43b99",
   "metadata": {},
   "source": [
    "### 2. Implementing RAG with Conversation History Management\n",
    "\n",
    "Build a PDF-based Question Answering system that incorporates conversational context.\n",
    "\n",
    "Create a standard RAG Chain, ensuring to include ```{chat_history}``` in the prompt template at step 6.\n",
    "\n",
    "- (step 1) Load PDF documents using ```PDFPlumberLoader```\n",
    "\n",
    "- (step 2) Segment documents into manageable chunks with ```RecursiveCharacterTextSplitter```\n",
    "\n",
    "- (step 3) Create vector embeddings of text chunks using ```OpenAIEmbeddings```\n",
    "\n",
    "- (step 4) Index and store embeddings in a ```FAISS``` vector database\n",
    "\n",
    "- (step 5) Implement a ```retriever``` to query relevant information from the vector database\n",
    "\n",
    "- (step 6) Design a QA prompt template that incorporates **conversation history** , user queries, and retrieved context with response instructions\n",
    "\n",
    "- (step 7) Initialize a ```ChatOpenAI``` instance configured to use the ```GPT-4o``` model\n",
    "\n",
    "- (step 8) Build the complete chain by connecting the retriever, prompt template, and language model\n",
    "\n",
    "The system retrieves relevant document context for user queries and generates contextually informed responses."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d1221d80",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
    "from langchain_community.document_loaders import PDFPlumberLoader\n",
    "from langchain_community.vectorstores import FAISS\n",
    "from langchain_core.output_parsers import StrOutputParser\n",
    "from langchain_core.prompts import PromptTemplate\n",
    "from langchain_openai import ChatOpenAI, OpenAIEmbeddings\n",
    "from langchain_community.chat_message_histories import ChatMessageHistory\n",
    "from langchain_core.runnables.history import RunnableWithMessageHistory\n",
    "from operator import itemgetter\n",
    "\n",
    "loader = PDFPlumberLoader(\"data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf\") \n",
    "docs = loader.load()\n",
    "\n",
    "text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)\n",
    "split_documents = text_splitter.split_documents(docs)\n",
    "\n",
    "embeddings = OpenAIEmbeddings()\n",
    "\n",
    "vectorstore = FAISS.from_documents(documents=split_documents, embedding=embeddings)\n",
    "\n",
    "retriever = vectorstore.as_retriever()\n",
    "\n",
    "prompt = PromptTemplate.from_template(\n",
    "    \"\"\"You are an assistant for question-answering tasks. \n",
    "Use the following pieces of retrieved context to answer the question. \n",
    "If you don't know the answer, just say that you don't know.\n",
    "\n",
    "#Previous Chat History:\n",
    "{chat_history}\n",
    "\n",
    "#Question: \n",
    "{question} \n",
    "\n",
    "#Context: \n",
    "{context} \n",
    "\n",
    "#Answer:\"\"\"\n",
    ")\n",
    "\n",
    "llm = ChatOpenAI(model_name=\"gpt-4o\", temperature=0)\n",
    "\n",
    "chain = (\n",
    "    {\n",
    "        \"context\": itemgetter(\"question\") | retriever,\n",
    "        \"question\": itemgetter(\"question\"),\n",
    "        \"chat_history\": itemgetter(\"chat_history\"),\n",
    "    }\n",
    "    | prompt\n",
    "    | llm\n",
    "    | StrOutputParser()\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "927cac10",
   "metadata": {},
   "source": [
    "**Implementing Conversation History Management**\n",
    "\n",
    "- Initialize the ```store``` dictionary to maintain conversation histories indexed by session IDs, and create the ```get_session_history``` function to retrieve or create session records\n",
    "\n",
    "- Create a ```RunnableWithMessageHistory``` instance to enhance the RAG chain with conversation tracking capabilities, handling both user queries and historical context"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2fa5e0d7",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Dictionary for storing session records\n",
    "store = {}\n",
    "\n",
    "# Retrieve session records by session ID\n",
    "def get_session_history(session_ids):\n",
    "    print(f\"[Conversation Session ID]: {session_ids}\")\n",
    "\n",
    "    if session_ids not in store:\n",
    "        # Initialize new ChatMessageHistory and store it\n",
    "        store[session_ids] = ChatMessageHistory()\n",
    "    return store[session_ids]  \n",
    "\n",
    "# Create RAG chain with conversation history tracking\n",
    "rag_with_history = RunnableWithMessageHistory(\n",
    "    chain,\n",
    "    get_session_history,  # Session history retrieval function\n",
    "    input_messages_key=\"question\",  # Template variable key for user question\n",
    "    history_messages_key=\"chat_history\",  # Key for conversation history\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d2753835",
   "metadata": {},
   "source": [
    "Process the first user input."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ef397b71",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Conversation Session ID]: rag123\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "\"The three key components necessary to achieve 'trustworthy AI' in the European approach to AI policy are: (1) compliance with the law, (2) fulfillment of ethical principles, and (3) robustness.\""
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rag_with_history.invoke(\n",
    "\n",
    "    # User query for analysis\n",
    "    {\"question\": \"What are the three key components necessary to achieve 'trustworthy AI' in the European approach to AI policy?\"},\n",
    "\n",
    "    # Session configuration for conversation tracking\n",
    "    config={\"configurable\": {\"session_id\": \"rag123\"}},\n",
    "\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "15026fb6",
   "metadata": {},
   "source": [
    "Execute the subsequent question."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "11c37944",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Conversation Session ID]: rag123\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "'Los tres componentes clave necesarios para lograr una \"IA confiable\" en el enfoque europeo de la política de IA son: (1) cumplimiento de la ley, (2) cumplimiento de principios éticos y (3) robustez.'"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rag_with_history.invoke(\n",
    "\n",
    "    # Request for translation of previous response\n",
    "    {\"question\": \"Please translate the previous answer into Spanish.\"},\n",
    "\n",
    "    # Session configuration for maintaining conversation context\n",
    "    config={\"configurable\": {\"session_id\": \"rag123\"}},\n",
    "    \n",
    ")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "3.11.9",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
